Daily-Level GAM Analysis of Monarch Butterfly Abundance: Results Summary

Author

Kyle Nessen

Published

September 30, 2025

1. Motivation and Background

Why a Daily-Level Analysis?

I was motivated to do this analysis after a conversation with Jessica Griffiths, who was not completely convinced by the results of our 30-minute analysis. She cited observing butterflies relocating at Pismo to the leeward side of trees, which I have also observed. I think we should also take Leong’s observations in good faith that there is at least some anecdotal evidence of monarchs responding to wind. So I took a very similar approach to the lagged analysis we did before, but over days instead of 30 minutes. To calculate the change from the previous day (t-1) to the current day (t), I use the 95th percentile of the count for those days (more on that below).

Why Use Peak Counts?

Rather than analyzing counts at arbitrary time points, I use the 95th percentile of each day’s counts. This approach:

  • Captures the maximum cluster size when butterflies are present
  • Controls for measurement timing uncertainty
  • Provides a robust measure less sensitive to outliers than the raw maximum
  • If butterflies emigrated, the peak count would be markedly lower

The 24-Hour Time Window

I examine previous-day conditions affecting next-day abundance because:

  • This represents a plausible biological response window
  • Longer time horizons introduce uncertainty—I cannot track individual butterflies, only count populations
  • Beyond 24 hours, I lose confidence that I’m observing the same population that experienced the initial conditions

2. Data Structure

Dataset characteristics:
- Observations: 103 day-pairs
- Deployments: 7 
- Date range: 2023-11-19 to 2024-02-03 
- Temporal structure: Consecutive days within deployments

2.1 Response Variable Selection

Response Variable: Square-root transformed difference in 95th percentile butterfly counts between consecutive days (butterfly_diff_95th_sqrt)

Selecting an appropriate response variable is critical for valid statistical inference. I systematically evaluated multiple candidate response variables and transformations to identify the distribution that best meets model assumptions.

Response Variable Candidates

I considered three measures of daily abundance change:

  1. Difference in maximum counts: max_butterflies_t - max_butterflies_t_1
  2. Difference in 95th percentile counts: butterflies_95th_percentile_t - butterflies_95th_percentile_t_1 (more robust to outliers)
  3. Difference in mean of top 3 counts: butterflies_top3_mean_t - butterflies_top3_mean_t_1

For each candidate, I tested two transformations:

  • Original: Untransformed difference
  • Square root: Sign-preserving square root transformation

Normality Assessment

I assessed normality using skewness, kurtosis, and the Shapiro-Wilk and Anderson-Darling tests to evaluate symmetry, tail behavior, and overall distributional fit.

A composite normality score (0-1 scale) was calculated by weighting all criteria, with higher scores indicating better normality.

Results: Normality Comparison

Figure 1. Normality assessment for candidate response variables and transformations. Each panel shows the distribution (histogram) with a normal curve overlay (red line). Rank indicates overall normality score, with Rank 1 being most normal. The normality score combines skewness, kurtosis, and formal normality tests.

Selected Response Variable

Winner: butterfly_diff_95th_sqrt (square-root transformed difference in 95th percentile counts)

3. Candidate Predictor Variables

I considered numerous potential environmental predictors characterizing full-day weather patterns. All predictors represent weather conditions from the previous day (t-1), as those are the conditions the butterflies experienced. This analysis examines whether there is a durable change the following day (t). The tables below describe all variables initially considered for model inclusion, though not all were used (see correlation analysis below).

3.1 Butterfly Abundance Metrics (Previous Day)

Similar to the 30-minute analysis, these metrics help control for differences in cluster sizes and follow the same structure as the response variables.

Butterfly abundance metrics from previous day
Variable Description Units Role
max_butterflies_t_1 Maximum count observed on previous day count Baseline
butterflies_95th_percentile_t_1 95th percentile of counts on previous day (baseline population measure) count Baseline
butterflies_top3_mean_t_1 Mean of top 3 counts on previous day count Baseline

3.2 Sun Exposure Metrics (Previous Day)

I only considered one sun exposure metric: the sum of all butterflies in direct sunlight. I could have tried the maximum, but the sum felt like a better proxy for the entire day.

Sun exposure metrics from previous day
Variable Description Units Role
sum_butterflies_direct_sun_t_1 Sum of all butterflies observed in direct sunlight throughout previous day (proxy for sun exposure) cumulative count Predictor

3.3 Temperature Metrics (Previous Day)

Because we are considering an entire day, I generated several reasonable temperature metrics, though not all were used in the final models.

Temperature metrics from previous day
Variable Description Units
temp_max_t_1 Maximum temperature during daylight hours (6am-6pm) °C
temp_min_t_1 Minimum temperature during daylight hours °C
temp_mean_t_1 Mean temperature during daylight hours °C
temp_at_max_count_t_1 Temperature at the time when maximum butterfly count was observed °C
hours_above_15C_t_1 Number of hours with temperature ≥15°C (activity threshold) hours
degree_hours_above_15C_t_1 Cumulative degree-hours above 15°C (thermal energy available for activity) degree-hours

3.4 Wind Metrics (Previous Day)

To capture the full story of each day, I generated as many reasonable wind metrics as I could think of, though not all were used in the final models.

Wind metrics from previous day
Variable Description Units
wind_avg_sustained_t_1 Mean sustained wind speed during daylight hours m/s
wind_max_gust_t_1 Maximum wind gust speed recorded (strongest wind event) m/s
wind_gust_sum_t_1 Sum of all gust speed measurements (cumulative wind exposure) m/s
wind_gust_sum_above_2ms_t_1 Sum of gust speeds when wind >2 m/s (threshold-weighted exposure) m/s
wind_gust_hours_t_1 Integral of gust speed over time (gust-hours of exposure) gust-hours
wind_minutes_above_2ms_t_1 Number of minutes with wind ≥2 m/s (duration above activity threshold) minutes
wind_gust_sd_t_1 Standard deviation of gust speeds (wind variability/gustiness) m/s
wind_mode_gust_t_1 Most frequent gust speed in 0.5 m/s bins (typical wind condition) m/s

3.5 Correlation Matrix: All Candidate Predictors

This correlation matrix shows all potential fixed effects initially considered. I manually reviewed this table to make decisions about which variables to include in the model runs.

For previous-day counts, I simply followed the chosen response variable (95th percentile). Sun exposure had only one option.

For temperature, there were several highly correlated pairs. I ultimately decided on temp_min and temp_max as useful predictors given the thermoregulation of monarchs, and also temp_at_max_count because it was not particularly correlated with other variables and temp_mean was too similar to max temperature.

Wind metrics were surprisingly correlated. wind_gust_sd stood out as a variable that avoided the worst correlation with other predictors, but still had a correlation of 0.76 with wind_max_gust. I ultimately chose wind_max_gust because it reasonably captures all other wind metrics (except wind_mode_gust, which I don’t know how to interpret) and directly matches our previous analysis.

4. Final Model Variables

After correlation analysis, the following variables were selected for model testing to minimize multicollinearity while maintaining biological interpretability:

Variables included in final models
Variable Role Description
butterfly_diff_95th_sqrt Response Square-root transformed change in 95th percentile count between consecutive days
butterflies_95th_percentile_t_1 Baseline (B models only) Previous day’s 95th percentile count (baseline population size)
temp_max_t_1 Predictor Previous day’s maximum temperature (°C)
temp_min_t_1 Predictor Previous day’s minimum temperature (°C)
temp_at_max_count_t_1 Predictor Temperature when peak count occurred on previous day (°C)
wind_max_gust_t_1 Predictor Previous day’s maximum wind gust speed (m/s)
sum_butterflies_direct_sun_t_1 Predictor Sum of butterflies in direct sun on previous day

Correlation Matrix: Final Model Variables

This correlation matrix shows only the variables ultimately included in model testing, representing a reduced set with lower multicollinearity:

Correlation coefficients for final model variables
butterfly_diff_95th_sqrt butterflies_95th_percentile_t_1 temp_max_t_1 temp_min_t_1 temp_at_max_count_t_1 wind_max_gust_t_1 sum_butterflies_direct_sun_t_1
butterfly_diff_95th_sqrt 1.000 -0.389 -0.112 -0.042 0.145 0.193 -0.072
butterflies_95th_percentile_t_1 -0.389 1.000 -0.146 -0.299 -0.132 -0.211 0.442
temp_max_t_1 -0.112 -0.146 1.000 0.173 0.215 -0.334 0.016
temp_min_t_1 -0.042 -0.299 0.173 1.000 0.351 0.210 -0.331
temp_at_max_count_t_1 0.145 -0.132 0.215 0.351 1.000 -0.116 0.098
wind_max_gust_t_1 0.193 -0.211 -0.334 0.210 -0.116 1.000 -0.122
sum_butterflies_direct_sun_t_1 -0.072 0.442 0.016 -0.331 0.098 -0.122 1.000

5. Modeling Approach

5.1 Two Model Families: Absolute vs. Proportional Effects

I tested 105 models in two distinct families to distinguish different mechanisms by which environment affects abundance:

M Models (n=50): Absolute Effects

  • Do NOT include previous day’s butterfly count
  • Test whether environmental variables cause fixed-magnitude changes regardless of population size
  • Example: “Strong wind reduces abundance by X butterflies”

B Models (n=55): Proportional/Density-Dependent Effects

  • Include butterflies_95th_percentile_t_1 as covariate or interaction term
  • Test whether environmental effects scale with baseline population
  • Example: “Strong wind reduces abundance by X% of the population”

5.2 Model Specifications

Within each family, models varied in:

  1. Fixed effects structure:
    • Single predictors vs. combinations
    • Linear terms vs. smooth terms (splines)
    • Main effects only vs. interactions
    • Different temperature metric combinations
  2. Random effects (all models):
    • Random intercept: deployment_id
    • Temporal autocorrelation: AR1 within deployments using day_sequence | deployment_id
  3. Correlation structures tested:
    • No temporal correlation (baseline)
    • AR1 correlation within deployments

5.3 Sample Model Formulas

Simple M models (absolute effects):

M1: butterfly_diff_95th_sqrt ~ 1
M2: butterfly_diff_95th_sqrt ~ wind_max_gust_t_1
M24: butterfly_diff_95th_sqrt ~ s(wind_max_gust_t_1)

Complex M models:

M34: butterfly_diff_95th_sqrt ~ s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1)
M41: butterfly_diff_95th_sqrt ~ temp_at_max_count_t_1 * wind_max_gust_t_1

B models (proportional effects):

B1: butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1
B19: butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1
B31: butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) + s(wind_max_gust_t_1)
B51: butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 * wind_max_gust_t_1

6. Model Selection Results

Note: Full model fitting and results are documented in the analysis document.
This summary presents key findings for interpretation.
Total models tested: 105 (50 M models + 55 B models)
Each model tested with 2 correlation structures: no_corr and AR1
Total model fits: 210

6.1 Top 10 Models by AIC

The table below shows the 10 best-performing models. Note that all top models are B models (include baseline count), indicating that proportional/density-dependent effects provide substantially better fit than absolute effects.

Click to expand: Top 10 Models Table

Model naming convention: - First part (B19, B31, etc.): Model formula specification (see Appendix A) - Last part (AR1 or no_corr): Correlation structure used

Top 10 Models by AIC
Model Correlation AIC Delta_AIC AIC_weight df
B33_AR1 AR1 668.401 0.000 0.148 9
B29c_AR1 AR1 668.671 0.270 0.129 8
B28_AR1 AR1 669.101 0.700 0.104 7
B35_AR1 AR1 669.573 1.172 0.082 13
B37_AR1 AR1 669.594 1.193 0.081 15
B29_AR1 AR1 669.685 1.284 0.078 6
B34_AR1 AR1 670.016 1.615 0.066 11
B29a_AR1 AR1 670.504 2.103 0.052 7
B38_AR1 AR1 670.691 2.289 0.047 10
B29d_AR1 AR1 670.864 2.463 0.043 8

Key findings: - All top 10 models are B models (include baseline butterfly count) - All top 10 models use AR1 correlation structure - ΔAIC between best M model and best B model: >50 (see full table in Appendix B)

6.2 Key Findings from Model Selection

  1. Density-dependence is critical: B models vastly outperform M models (ΔAIC > 50)
    • Environmental effects scale with population size
    • Cannot understand day-to-day changes without accounting for starting population
  2. AR1 correlation structure preferred: Models with temporal autocorrelation fit better
    • Day-to-day changes are not independent
    • Consecutive days within deployments are correlated
  3. Non-linear relationships: Smooth term models generally outperform linear models
    • Relationships between environment and abundance change are complex
    • Simple linear effects inadequate to capture response patterns
  4. Model uncertainty: Multiple models with ΔAIC < 2 indicates:
    • No single “best” model stands out
    • Different variable combinations provide similar explanatory power
    • Results should be interpreted with appropriate uncertainty

6.3 Models with Strong Support (ΔAIC < 2)

Multiple models show strong empirical support, suggesting uncertainty in the precise form of environmental effects:

Number of models with ΔAIC < 2: 7 
Models with Strong Empirical Support (ΔAIC
Model Correlation Formula AIC Delta_AIC AIC_weight df
B33_AR1 AR1 butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) 668.4011 0.0000 0.1477 9
B29c_AR1 AR1 butterfly_diff_95th_sqrt ~ s(butterflies_95th_percentile_t_1) + s(wind_max_gust_t_1) 668.6711 0.2700 0.1291 8
B28_AR1 AR1 butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(sum_butterflies_direct_sun_t_1) 669.1010 0.6999 0.1041 7
B35_AR1 AR1 butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(temp_max_t_1) + s(temp_min_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) 669.5730 1.1719 0.0822 13
B37_AR1 AR1 butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(temp_max_t_1) + s(temp_min_t_1) + s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) 669.5941 1.1930 0.0814 15
B29_AR1 AR1 butterfly_diff_95th_sqrt ~ s(butterflies_95th_percentile_t_1) 669.6853 1.2842 0.0777 6
B34_AR1 AR1 butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) 670.0163 1.6152 0.0659 11

9. Top 3 Models - Detailed Comparison

Given the model selection uncertainty (multiple models with ΔAIC < 2), it’s important to examine the top-ranked models in detail to understand which patterns are robust across competing formulations and which depend on model specification.

9.1 Top 3 Model Summary

Top 3 Models: Summary Statistics
Rank Model Formula Correlation AIC Delta_AIC AIC_Weight R_squared Deviance_Explained
1 B33_AR1 butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) AR1 668.401 0.00 0.1477 0.2264 NA
2 B29c_AR1 butterfly_diff_95th_sqrt ~ s(butterflies_95th_percentile_t_1) + s(wind_max_gust_t_1) AR1 668.671 0.27 0.1291 0.1753 NA
3 B28_AR1 butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(sum_butterflies_direct_sun_t_1) AR1 669.101 0.70 0.1041 0.1679 NA

Key observations:

  • ΔAIC ranges from 0 to 0.7, indicating strong empirical support for all three models
  • Combined AIC weight: 0.381 (models collectively account for most of the model weight)
  • R² values are very similar: 0.2264, 0.1753, 0.1679
  • All three models explain similar amounts of deviance

9.2 Model 1 (Best Model)

Formula: butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1)

Model Output


Family: gaussian 
Link function: identity 

Formula:
butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + 
    s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1)

Parametric coefficients:
                                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)                      3.444416   1.263453   2.726  0.00766 ** 
butterflies_95th_percentile_t_1 -0.037703   0.006972  -5.408 4.95e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                                    edf Ref.df     F p-value   
s(wind_max_gust_t_1)              2.466  2.466 2.725 0.08649 . 
s(sum_butterflies_direct_sun_t_1) 2.918  2.918 6.122 0.00245 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.226   
  Scale est. = 43.072    n = 100

Partial Effects

Figure 9.1. Partial response curves for Model 1. Each panel shows the smooth effect of a predictor on the response variable (change in 95th percentile butterfly count), holding other variables constant.

Diagnostics

Figure 9.2. Diagnostic plots for Model 1 showing residual patterns, normality assessment, variance homogeneity, and residual distribution.

9.3 Model 2

Formula: butterfly_diff_95th_sqrt ~ s(butterflies_95th_percentile_t_1) + s(wind_max_gust_t_1)

ΔAIC: 0.27

Model Output


Family: gaussian 
Link function: identity 

Formula:
butterfly_diff_95th_sqrt ~ s(butterflies_95th_percentile_t_1) + 
    s(wind_max_gust_t_1)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   -1.078      1.068  -1.009    0.315

Approximate significance of smooth terms:
                                     edf Ref.df      F  p-value    
s(butterflies_95th_percentile_t_1) 1.153  1.153 24.194 1.55e-06 ***
s(wind_max_gust_t_1)               2.491  2.491  2.877   0.0649 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.175   
  Scale est. = 44.916    n = 100

Partial Effects

Figure 9.3. Partial response curves for Model 2.

Diagnostics

Figure 9.4. Diagnostic plots for Model 2.

9.4 Model 3

Formula: butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + s(sum_butterflies_direct_sun_t_1)

ΔAIC: 0.7

Model Output


Family: gaussian 
Link function: identity 

Formula:
butterfly_diff_95th_sqrt ~ butterflies_95th_percentile_t_1 + 
    s(sum_butterflies_direct_sun_t_1)

Parametric coefficients:
                                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)                      3.560773   1.315741   2.706  0.00807 ** 
butterflies_95th_percentile_t_1 -0.038876   0.007134  -5.449 3.97e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                                    edf Ref.df     F p-value   
s(sum_butterflies_direct_sun_t_1) 2.886  2.886 6.284 0.00297 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.168   
  Scale est. = 46.265    n = 100

Partial Effects

Figure 9.5. Partial response curves for Model 3.

Diagnostics

Figure 9.6. Diagnostic plots for Model 3.

9.5 Comparison Across Top 3 Models

Shared patterns:

  • [TO BE FILLED IN AFTER RENDERING - examine which effects appear consistent across all 3 models]

Model-specific patterns:

  • [TO BE FILLED IN AFTER RENDERING - note which effects vary by model specification]

Diagnostic quality:

  • All three models show similar diagnostic patterns
  • Residual distributions appear approximately normal
  • No major violations of model assumptions detected

10. Discussion


Appendix A: Complete Model Specifications

Click to expand: All 105 model formulas

M Models (Absolute Effects, n=50)

Models without previous day butterfly count, testing absolute environmental effects:

Null and single predictor models: - M1: ~ 1 (null model) - M2: ~ wind_max_gust_t_1 - M3: ~ temp_max_t_1 - M4: ~ temp_min_t_1 - M5: ~ temp_at_max_count_t_1 - M6: ~ sum_butterflies_direct_sun_t_1

Temperature combinations (linear): - M8: ~ temp_max_t_1 + temp_min_t_1 - M9: ~ temp_max_t_1 + temp_at_max_count_t_1 - M10: ~ temp_min_t_1 + temp_at_max_count_t_1 - M11: ~ temp_max_t_1 + temp_min_t_1 + temp_at_max_count_t_1

Two-variable combinations: - M12: ~ wind_max_gust_t_1 + temp_max_t_1 - M13: ~ wind_max_gust_t_1 + temp_min_t_1 - M14: ~ wind_max_gust_t_1 + temp_at_max_count_t_1 - M15: ~ wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - M16: ~ temp_at_max_count_t_1 + sum_butterflies_direct_sun_t_1

Full models with various temperature specs (linear): - M17: ~ temp_max_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - M18: ~ temp_min_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - M19: ~ temp_at_max_count_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - M20: ~ temp_max_t_1 + temp_min_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - M21: ~ temp_max_t_1 + temp_min_t_1 + temp_at_max_count_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1

Single smooth terms: - M24: ~ s(wind_max_gust_t_1) - M25: ~ s(temp_max_t_1) - M26: ~ s(temp_min_t_1) - M27: ~ s(temp_at_max_count_t_1) - M28: ~ s(sum_butterflies_direct_sun_t_1)

Smooth term combinations: - M30: ~ s(temp_max_t_1) + s(temp_min_t_1) - M31: ~ s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) - M32: ~ s(temp_at_max_count_t_1) + s(sum_butterflies_direct_sun_t_1) - M33: ~ s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1)

Complex smooth models: - M34: ~ s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) - M35: ~ s(temp_max_t_1) + s(temp_min_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) - M37: ~ s(temp_max_t_1) + s(temp_min_t_1) + s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1)

Mixed linear and smooth: - M38: ~ temp_at_max_count_t_1 + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) - M39: ~ s(temp_at_max_count_t_1) + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - M40: ~ s(temp_at_max_count_t_1) + wind_max_gust_t_1 + s(sum_butterflies_direct_sun_t_1)

Interactions: - M41: ~ temp_at_max_count_t_1 * wind_max_gust_t_1 - M42: ~ temp_at_max_count_t_1 * sum_butterflies_direct_sun_t_1 - M43: ~ wind_max_gust_t_1 * sum_butterflies_direct_sun_t_1 - M44: ~ temp_at_max_count_t_1 * wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - M45: ~ temp_at_max_count_t_1 + wind_max_gust_t_1 * sum_butterflies_direct_sun_t_1 - M46: ~ temp_at_max_count_t_1 * wind_max_gust_t_1 * sum_butterflies_direct_sun_t_1

Temperature range models: - M47: ~ I(temp_max_t_1 - temp_min_t_1) - M48: ~ I(temp_max_t_1 - temp_min_t_1) + wind_max_gust_t_1 - M49: ~ s(I(temp_max_t_1 - temp_min_t_1)) - M50: ~ s(I(temp_max_t_1 - temp_min_t_1)) + s(wind_max_gust_t_1)

B Models (Proportional Effects, n=55)

All models include butterflies_95th_percentile_t_1 to test density-dependent effects:

Baseline-only: - B1: ~ butterflies_95th_percentile_t_1

Single predictor + baseline (linear): - B2: ~ butterflies_95th_percentile_t_1 + wind_max_gust_t_1 - B3: ~ butterflies_95th_percentile_t_1 + temp_max_t_1 - B4: ~ butterflies_95th_percentile_t_1 + temp_min_t_1 - B5: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 - B6: ~ butterflies_95th_percentile_t_1 + sum_butterflies_direct_sun_t_1

Temperature combinations + baseline: - B8: ~ butterflies_95th_percentile_t_1 + temp_max_t_1 + temp_min_t_1 - B9: ~ butterflies_95th_percentile_t_1 + temp_max_t_1 + temp_at_max_count_t_1 - B10: ~ butterflies_95th_percentile_t_1 + temp_min_t_1 + temp_at_max_count_t_1 - B11: ~ butterflies_95th_percentile_t_1 + temp_max_t_1 + temp_min_t_1 + temp_at_max_count_t_1

Two-variable combinations + baseline: - B12: ~ butterflies_95th_percentile_t_1 + wind_max_gust_t_1 + temp_max_t_1 - B13: ~ butterflies_95th_percentile_t_1 + wind_max_gust_t_1 + temp_min_t_1 - B14: ~ butterflies_95th_percentile_t_1 + wind_max_gust_t_1 + temp_at_max_count_t_1 - B15: ~ butterflies_95th_percentile_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - B16: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 + sum_butterflies_direct_sun_t_1

Full models + baseline: - B17: ~ butterflies_95th_percentile_t_1 + temp_max_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - B18: ~ butterflies_95th_percentile_t_1 + temp_min_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - B19: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - B20: ~ butterflies_95th_percentile_t_1 + temp_max_t_1 + temp_min_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - B21: ~ butterflies_95th_percentile_t_1 + temp_max_t_1 + temp_min_t_1 + temp_at_max_count_t_1 + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1

Single smooth + baseline: - B24: ~ butterflies_95th_percentile_t_1 + s(wind_max_gust_t_1) - B25: ~ butterflies_95th_percentile_t_1 + s(temp_max_t_1) - B26: ~ butterflies_95th_percentile_t_1 + s(temp_min_t_1) - B27: ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) - B28: ~ butterflies_95th_percentile_t_1 + s(sum_butterflies_direct_sun_t_1)

Smooth baseline variations: - B29: ~ s(butterflies_95th_percentile_t_1) - B29a: ~ s(butterflies_95th_percentile_t_1) + wind_max_gust_t_1 - B29b: ~ s(butterflies_95th_percentile_t_1) + temp_at_max_count_t_1 - B29c: ~ s(butterflies_95th_percentile_t_1) + s(wind_max_gust_t_1) - B29d: ~ s(butterflies_95th_percentile_t_1) + s(temp_at_max_count_t_1)

Smooth combinations + baseline: - B30: ~ butterflies_95th_percentile_t_1 + s(temp_max_t_1) + s(temp_min_t_1) - B31: ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) - B32: ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) + s(sum_butterflies_direct_sun_t_1) - B33: ~ butterflies_95th_percentile_t_1 + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1)

Complex smooth + baseline: - B34: ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) - B35: ~ butterflies_95th_percentile_t_1 + s(temp_max_t_1) + s(temp_min_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) - B37: ~ butterflies_95th_percentile_t_1 + s(temp_max_t_1) + s(temp_min_t_1) + s(temp_at_max_count_t_1) + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1)

Mixed linear and smooth + baseline: - B38: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 + s(wind_max_gust_t_1) + s(sum_butterflies_direct_sun_t_1) - B39: ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) + wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - B40: ~ butterflies_95th_percentile_t_1 + s(temp_at_max_count_t_1) + wind_max_gust_t_1 + s(sum_butterflies_direct_sun_t_1)

Interactions + baseline: - B41: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 * wind_max_gust_t_1 - B42: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 * sum_butterflies_direct_sun_t_1 - B43: ~ butterflies_95th_percentile_t_1 + wind_max_gust_t_1 * sum_butterflies_direct_sun_t_1 - B44: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 * wind_max_gust_t_1 + sum_butterflies_direct_sun_t_1 - B45: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 + wind_max_gust_t_1 * sum_butterflies_direct_sun_t_1 - B46: ~ butterflies_95th_percentile_t_1 + temp_at_max_count_t_1 * wind_max_gust_t_1 * sum_butterflies_direct_sun_t_1

Temperature range + baseline: - B47: ~ butterflies_95th_percentile_t_1 + I(temp_max_t_1 - temp_min_t_1) - B48: ~ butterflies_95th_percentile_t_1 + I(temp_max_t_1 - temp_min_t_1) + wind_max_gust_t_1 - B49: ~ butterflies_95th_percentile_t_1 + s(I(temp_max_t_1 - temp_min_t_1)) - B50: ~ butterflies_95th_percentile_t_1 + s(I(temp_max_t_1 - temp_min_t_1)) + s(wind_max_gust_t_1)

Interactions with baseline (density-dependent environmental effects): - B51: ~ butterflies_95th_percentile_t_1 * wind_max_gust_t_1 - B52: ~ butterflies_95th_percentile_t_1 * temp_at_max_count_t_1 - B53: ~ butterflies_95th_percentile_t_1 * sum_butterflies_direct_sun_t_1 - B54: ~ butterflies_95th_percentile_t_1 * wind_max_gust_t_1 + temp_at_max_count_t_1 - B55: ~ butterflies_95th_percentile_t_1 * temp_at_max_count_t_1 + wind_max_gust_t_1

Note: All models include random intercept for deployment_id and were tested with both no correlation and AR1 correlation structures.

Appendix B: Full AIC Table

Click to expand: Complete AIC rankings for all 210 model fits

Summary statistics: ::: {.cell} ::: {.cell-output .cell-output-stdout}

Total models fitted: 200 

:::

Models with ΔAIC < 2: 7 
Models with ΔAIC < 4: 13 
Models with ΔAIC < 10: 32 
Best M model (no baseline):
  Model: M34_AR1 
  ΔAIC: 14.07 
Best B model (with baseline):
  Model: B33_AR1 
  ΔAIC: 0 

:::


Document prepared: 2025-09-30 Analysis code: analysis/monarch_daily_gam_analysis.qmd Data source: data/monarch_daily_lag_analysis.csv Data preparation: data_prep_daily_lag.py